An Effective Learning Method for Automatic Speech Recognition in Korean CI Patients’ Speech
Abstract
An automatic speech recognition (ASR) model usually requires a large amount of training data to provide better results than ASR models trained on small data. It is difficult to apply ASR to non-standard speech such as that of cochlear implant (CI) patients, owing to privacy concerns or difficulty of access. In this paper, an effective finetuning and augmentation method is proposed. Experiments compare the character error rate (CER) after the basic method and the proposed method. The proposed method achieved a CER of 36.03% on the CI patients' test dataset using only 2 h 30 min of data, which is a 62% improvement over...
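For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate is commonly computed: the character-level Levenshtein (edit) distance between the reference transcript and the recognizer output, divided by the reference length. The function names are hypothetical, and the abstract does not say whether the authors count whitespace or score Korean at the syllable or jamo level, so this sketch simply compares Unicode code points as given.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One substituted character out of four reference characters.
    print(character_error_rate("abcd", "abxd"))
```

Because the denominator is the reference length, a CER above 100% is possible when the hypothesis contains many insertions.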
Similar resources
Automatic recognition of Korean broadcast news speech
This paper describes preliminary results of automatic recognition of Korean broadcast-news speech. We have been working on flexible vocabulary isolated-word speech recognition, and the same HMM models are used for broadcast-news continuous speech recognition. The recognizer is trained by using phonetically balanced isolated words speech, rather than the broadcast news speech itself. In this res...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
An automatic method for learning a Japanese lexicon for recognition of spontaneous speech
When developing a speech recognition system, one must start by deciding what the units to be recognized should be. This is for the most part a straightforward choice in the case of word-based languages such as English, but becomes an issue even in handling languages with a complex compounding system like German; with an agglutinative language like Japanese, which provides no spaces in written t...
PLASER: Pronunciation Learning Via Automatic Speech Recognition
PLASER is a multimedia tool with instant feedback designed to teach English pronunciation for high-school students of Hong Kong whose mother tongue is Cantonese Chinese. The objective is to teach correct pronunciation and not to assess a student’s overall pronunciation quality. Major challenges related to speech recognition technology include: allowance for non-native accent, reliable and corre...
Speech production knowledge in automatic speech recognition.
Although much is known about how speech is produced, and research into speech production has resulted in measured articulatory data, feature systems of different kinds, and numerous models, speech production knowledge is almost totally ignored in current mainstream approaches to automatic speech recognition. Representations of speech production allow simple explanations for many phenomena obser...
Journal
Journal title: Electronics
Year: 2021
ISSN: 2079-9292
DOI: https://doi.org/10.3390/electronics10070807